Hybridized KNN and SVM for gene expression data classification

Authors

Zhen Mei

Qi Shen

Baoxian Ye

Abstract

Support vector machine (SVM) is one of the most powerful supervised learning algorithms in gene expression analysis. Samples that are intermixed with another class or that lie in the overlapping boundary region can make the decision boundary overly complex and can degrade the accuracy of SVM. In this paper, a hybrid of k-nearest neighbor (KNN) classifiers and SVM (HKNNSVM) is proposed to deal with samples in the overlapping boundary region and to improve the performance of SVM. The first KNN is used to prune the training samples, and the second KNN is combined with SVM to classify the cancer samples. The proposed algorithm was applied to binary and multiclass classification of gene expression data, and the results were compared with those obtained by SVM and KNN alone. The results demonstrate that the proposed method is a useful tool for classification and that sample pruning reduces the misclassification rate on the prediction set. For datasets containing mislabeled samples, the misclassification rates of HKNNSVM were notably lower than those of SVM and KNN, which indicates that the classification performance of HKNNSVM is stable. [Life Science Journal. 2009; 6(1): 61–66] (ISSN: 1097-8135).
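The pruning step mentioned in the abstract can be illustrated with a small sketch. The abstract does not state the exact pruning rule, so the rule used here is an assumption: a training sample is dropped when its label disagrees with the majority label of its k nearest neighbors. The function name `knn_prune`, the value k = 3, and the toy data are all hypothetical and are not taken from the paper.

```python
from collections import Counter
import math


def knn_prune(samples, labels, k=3):
    """Return indices of samples whose label agrees with the majority
    label of their k nearest neighbours (leave-one-out).

    Assumed rule for illustration only; the paper's actual pruning
    criterion is not given in the abstract.
    """
    kept = []
    for i, (x, y) in enumerate(zip(samples, labels)):
        # Euclidean distance from sample i to every other sample.
        dists = sorted(
            (math.dist(x, other), j)
            for j, other in enumerate(samples)
            if j != i
        )
        neighbour_labels = [labels[j] for _, j in dists[:k]]
        majority, _ = Counter(neighbour_labels).most_common(1)[0]
        if majority == y:
            kept.append(i)
    return kept


# Toy two-feature "expression profiles": sample 3 carries label "B"
# but sits inside the "A" cluster, mimicking a mislabeled sample.
samples = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.15, 0.15),
           (1.0, 1.0), (1.1, 0.9), (0.9, 1.1)]
labels = ["A", "A", "A", "B", "B", "B", "B"]

kept = knn_prune(samples, labels, k=3)
print(kept)  # indices of the samples that survive pruning
```

Under this assumed rule, the retained samples would then be passed to the SVM for training. How the second KNN is combined with the SVM at prediction time is not described in the abstract, so it is left out of the sketch.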

Related resources

Feature Selection and Classification of Microarray Gene Expression Data of Ovarian Carcinoma Patients using Weighted Voting Support Vector Machine

DNA microarray gene expression data provide a wealth of information on thousands of variables (genes). Analysis of this information can reveal the genetic causes of disease and of differences between tumors. In this study we try to reduce the high-dimensional data with a statistical method to select valuable, high-impact genes as biomarkers and then classify ovarian tumors based on gene expression data of...

Gene Identification from Microarray Data for Diagnosis of Acute Myeloid and Lymphoblastic Leukemia Using a Sparse Gene Selection Method

Background: Microarray experiments can simultaneously determine the expression of thousands of genes. Identification of potential genes from microarray data for diagnosis of cancer is important. This study aimed to identify genes for the diagnosis of acute myeloid and lymphoblastic leukemia using a sparse feature selection method. Materials and Methods: In this descriptive study, the expressio...

Modification of the Fast Global K-means Using a Fuzzy Relation with Application in Microarray Data Analysis

Recognizing genes with distinctive expression levels can help in prevention, diagnosis and treatment of the diseases at the genomic level. In this paper, fast Global k-means (fast GKM) is developed for clustering the gene expression datasets. Fast GKM is a significant improvement of the k-means clustering method. It is an incremental clustering method which starts with one cluster. Iteratively ...

Performance Analysis of Genetic Algorithm with kNN and SVM for Feature Selection in Tumor Classification

Tumor classification is a key area of research in the field of bioinformatics. Microarray technology is commonly used in the study of disease diagnosis using gene expression levels. The main drawback of gene expression data is that it contains thousands of genes but very few samples. Feature selection methods are used to select the informative genes from the microarray. These methods...

Prediction of blood cancer using leukemia gene expression data and sparsity-based gene selection methods

Background: DNA microarray is a useful technology that simultaneously assesses the expression of thousands of genes. It can be utilized for the detection of cancer types and cancer biomarkers. This study aimed to predict blood cancer using leukemia gene expression data and a robust ℓ2,p-norm sparsity-based gene selection method. Materials and Methods: In this descriptive study, the microarray ...

Publication date: 2005
